Estimate earth fissure hazard based on machine learning in the Qa’ Jahran Basin, Yemen

Earth fissures are potential hazards that often cause severe damage and affect infrastructure, the environment, and socio-economic development. Owing to the complexity of their causes, the prediction of earth fissures remains a challenging task. In this study, we assess earth fissure hazard susceptibility mapping using four advanced machine learning algorithms, namely random forest (RF), extreme gradient boosting (XGBoost), Naïve Bayes (NB), and K-nearest neighbor (KNN). Using the Qa’ Jahran Basin in Yemen as a case study area, 152 fissure locations were recorded via a field survey to create an earth fissure inventory, and 11 earth fissure conditioning factors, comprising topographical, hydrological, geological, and environmental factors, were obtained from various data sources. The outputs of the models were compared and analyzed using statistical indices such as the confusion matrix, overall accuracy, and area under the receiver operating characteristic (AUROC) curve. The results revealed that the RF algorithm, with an overall accuracy of 95.65% and an AUROC of 0.99, showed excellent performance for generating hazard maps, followed by XGBoost, with an overall accuracy of 92.39% and an AUROC of 0.98; the NB model, with an overall accuracy of 88.43% and an AUROC of 0.96; and the KNN model, with an overall accuracy of 80.43% and an AUROC of 0.88. Such findings can assist land management planners, local authorities, and decision-makers in managing present and future earth fissures to protect society and the ecosystem and in implementing suitable protection measures.

The YTS formed the lower section of the volcanic Qa' Jahran Basin and evolved from the Oligocene to Miocene 56; its evolution was connected to the Afar plume that influenced the Arabia-Africa zone during the Oligocene, and to the opening of the Red Sea and the Gulf of Aden. The YTS occurs as intrusions and lava flows in the study area and comprises ignimbrite, dacite, rhyolite, basalt, trachyte, granite, and ash flow 55. The YTS also forms a range of semi-steep mountains (within 60°) toward the study area's northern, eastern, and western borders. These mountains reach their greatest heights in the northern and western sections of the study area. They also outcrop at the western boundary of the study area and extend to the south 55. According to the geological map, there is a significant amorphous granite intrusion in the northwestern part of the study region (Fig. 2). The age of this intrusion is equivalent to that of the Tertiary granite intrusions of Jabal-Bura (Hodeidah), Jabal-Hufashash (AL-Mahwait), and Jabal Saber (Taiz). This intrusion is also strongly related to the opening of the Red Sea Rift structure. The YVS was first described, based on geochemical and geochronological evidence, by Mattash 57 and Beydoun et al. 58. It formed and developed during the post-drift stage (Miocene to Recent) and is separated by an unconformity. The age assigned to the YVS varies between 11.3 and 0.04 Ma 55,57. The YVS is found primarily in the southeast and south of the study region, with small parts in the west. Basaltic lava was emplaced from a major normal fault onto a considerable, N-S-trending mass of ignimbrite. In the study area, the Quaternary deposits appear as sediment plains in the elongated depressions. These deposits comprise silt, clay, sand, gravel, alluvial terraces, and basin alluvium.
Alluvial Quaternary deposits form thick accumulations in the center of the region, are eroded toward the margins of the basin, and are strongly deformed and uplifted east of the basin by a normal fault. Structurally, the basin was formed by half-graben displacement, which resulted in an elongated basin bounded to the east by a normal fault 55.
Fig. 3 shows the overall procedure followed in this study to set up and implement the planned ML models. The workflow can be summarized in three main steps: (1) preparation of input variable
Fissure inventory map creation. The construction of an event inventory map is the first and most significant stage in modeling 59. A total of 152 fissure points (value 1) and 152 non-fissure points (value 0) were collected from the field. Thereafter, 70% of the dataset was used for training and 30% for model validation 59-61. The appropriate extent of data collection and the way to divide the data into subsets while avoiding over-fitting are difficult to determine; however, the survey and data collection were judged to be reasonable and comprehensive for the study area (Fig. 1).
1,34,39. Other significant human activities include land-use changes and vegetation cover deterioration, all of which might influence the occurrence of earth fissures. Furthermore, environmental and topographical factors such as faults, drainage density, elevation, and lithology can influence the occurrence of earth fissures. Based on the environmental characteristics and data availability in the study region, eleven essential factors were considered in this study (Fig. 4), namely, well density (WD), water level (WL), groundwater drawdown (GWDW), elevation, aspect, slope percentage, drainage density (DD), normalized difference vegetation index (NDVI), land subsidence (LS), distance to faults (DTF), and land use. Elevation, slope, and aspect were
Groundwater data were obtained from 20 National Water Resources Authority (NWRA) observation wells. Groundwater drawdown and water level maps were prepared from 12 years (2008-2020) of observation-well records using ArcGIS v. 10.3 (kriging interpolation in the Spatial Analyst tool). Land subsidence was identified for January 2020 to April 2020 using InSAR (Interferometric Synthetic Aperture Radar) applied to Sentinel-1 data with the ESA SNAP Toolbox. The data processing steps are shown in Fig. 5 62,63.

Figure 2.
Multicollinearity assessment of conditioning factors. The multicollinearity test may improve model results in natural hazard studies by selecting ideal factors for hazard mapping 64. Multicollinearity refers to a lack of independence among the independent variables, that is, significant correlations between them, which can arise in a dataset and mislead the analysis of their effects 65. To examine the multicollinearity of the independent variables in earth fissure modeling, the tolerance (TOL) and variance inflation factor (VIF) were used in this study 66. A multicollinearity problem is indicated by a VIF of 10 or above and a TOL of less than 0.10 67:
$$\mathrm{TOL}_j = 1 - R_j^2, \qquad \mathrm{VIF}_j = \frac{1}{\mathrm{TOL}_j} = \frac{1}{1 - R_j^2},$$
where $R_j^2$ is the coefficient of determination obtained by regressing conditioning factor $j$ on all the other conditioning factors.
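The TOL and VIF screening described above can be sketched in a few lines. The following is a minimal illustration in standard-library Python, not the R workflow used in the study; the function names and the toy factor values are hypothetical, and only the thresholds (VIF of 10 or above, TOL below 0.10) come from the text.

```python
from typing import Dict, List, Tuple


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def r_squared(y: List[float], xs: List[List[float]]) -> float:
    """R^2 of an ordinary least-squares fit of y on the columns xs plus an intercept."""
    n = len(y)
    design = [[1.0] + [col[i] for col in xs] for i in range(n)]
    p = len(design[0])
    xtx = [[sum(design[i][a] * design[i][b] for i in range(n)) for b in range(p)]
           for a in range(p)]
    xty = [sum(design[i][a] * y[i] for i in range(n)) for a in range(p)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(beta[a] * design[i][a] for a in range(p)) for i in range(n)]
    mean_y = sum(y) / n
    ss_res = sum((y[i] - fitted[i]) ** 2 for i in range(n))
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    return 1.0 - ss_res / ss_tot


def tol_and_vif(columns: Dict[str, List[float]]) -> Dict[str, Tuple[float, float]]:
    """Return {factor: (TOL, VIF)}, regressing each factor on all the others."""
    out = {}
    for name, col in columns.items():
        others = [c for k, c in columns.items() if k != name]
        tol = 1.0 - r_squared(col, others)
        out[name] = (tol, 1.0 / tol)
    return out


# Hypothetical, deliberately near-collinear toy values (not data from the study).
toy_factors = {
    "elevation": [1.0, 2.0, 3.0, 4.0, 5.0],
    "slope": [1.1, 1.9, 3.2, 3.9, 5.1],
}
# Apply the thresholds quoted in the text: VIF >= 10 or TOL < 0.10.
flagged = [name for name, (tol, vif) in tol_and_vif(toy_factors).items()
           if vif >= 10 or tol < 0.10]
```

Both toy factors end up in `flagged` because each is almost a linear function of the other; factors that pass the screening would be kept for modeling.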
Applied ML models for earth fissure hazard mapping. To predict earth fissures, supervised machine learning classification techniques were applied using the CARET package in R, which provides functions for preprocessing, model training, model prediction, and model evaluation. Once the dependent and independent factors had been prepared, the dataset (training and testing data) was imported into R. Preprocessing scaled the data and encoded each categorical variable as a set of boolean inputs, each representing one category with 0 or 1. After that procedure, the best split was examined to identify which variable would be best for splitting the decision tree, and the data were visualized to see the relationship between the variables and earth fissure frequency. The RF, XGBoost, NB, and KNN models were built using all datasets with the best conditioning factors. The models were first run with default settings; the hyperparameters were then optimized over multiple values, and the models were re-run with the tuning parameters that gave the highest accuracy (Table 1). The RF algorithm used 500 trees, and the model's best final value was mtry = 7, with grid search performing better than random search (Fig. 6a). In the XGBoost algorithm, subsample, min_child_weight, and eta were found to clearly improve accuracy. In particular, subsample gave better accuracy at a value of 1, min_child_weight gave better accuracy at 0 than at 1 or 2, and eta gave the best accuracy at 0.3 compared with 0.05 and 1. nrounds, colsample_bytree, (Fig. 6b). For the NB algorithm, expand.grid(fL = c(0), usekernel = T, adjust = c(0.5)) was used to achieve the highest accuracy (Fig. 6c). There was not much difference in accuracy between the default parameters and the tuned hyperparameters. Therefore, we used the default parameter settings because they were less time-consuming.
In the KNN model, the default parameters and the tuned hyperparameters agree on the value of k = 21; as k increases beyond this value, accuracy starts to decrease again. Of note, optimizing hyperparameters is a critical step in maximizing accuracy. After that, the confusion matrix was computed to evaluate each model, ROC curves were plotted to calculate the AUC values, and a prediction map was then produced using raster data.
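The idea of choosing k by comparing accuracy across candidate values can be sketched as follows. This is a standard-library Python illustration with a single hold-out set, whereas caret tunes by resampling; the function names, the toy points, and the candidate list are all hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, ...]


def knn_predict(train_x: Sequence[Point], train_y: Sequence[int],
                query: Point, k: int) -> int:
    """Majority label among the k training points closest to the query."""
    dists = sorted((math.dist(p, query), label) for p, label in zip(train_x, train_y))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def holdout_accuracy(train_x, train_y, val_x, val_y, k: int) -> float:
    """Share of validation points whose predicted label matches the true label."""
    hits = sum(knn_predict(train_x, train_y, q, k) == y for q, y in zip(val_x, val_y))
    return hits / len(val_y)


def best_k(train_x, train_y, val_x, val_y,
           candidates: List[int]) -> Tuple[int, Dict[int, float]]:
    """Score every candidate k; on ties, prefer the smaller k."""
    scores = {k: holdout_accuracy(train_x, train_y, val_x, val_y, k) for k in candidates}
    chosen = max(scores, key=lambda k: (scores[k], -k))
    return chosen, scores


# Hypothetical toy data: label 1 = fissure, 0 = non-fissure (not from the study).
toy_train_x = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6)]
toy_train_y = [0, 0, 0, 0, 1, 1, 1, 1]
toy_val_x = [(0.2, 0.3), (5.8, 5.1)]
toy_val_y = [0, 1]

chosen_k, k_scores = best_k(toy_train_x, toy_train_y, toy_val_x, toy_val_y, [1, 3, 5, 7])
```

Odd candidate values avoid tied votes in a two-class problem. In practice the factors would be scaled first, since Euclidean distance is sensitive to the units of each factor.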
Random forest. RF is an ensemble learning algorithm designed to improve classification and regression trees by combining a large number of decision trees 68. RF is an effective method for handling data vagueness and complexity and has been successfully used to evaluate many complex datasets 50,69,70. Owing to its robustness, flexibility, and ability to handle complex data structures, RF has also proven to be one of the most widely used hazard modeling techniques 71-74. The RF algorithm relies on two robust techniques: random subspace selection and bagging 75. RF grows a set of binary classification trees (ntree), each built on a bootstrap sample drawn with replacement from the raw data. Each classification tree casts a unit vote, and the final classification is the majority vote over all trees in the forest. Three key parameters in the implementation of the RF technique are the number of trees (ntree), the number of candidate features for each split (mtry), and the minimum number of observations in a terminal node (nodesize). Further mathematical details can be found in the literature 68,76.
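The two ingredients named above, bagging and random subspace selection, together with majority voting, can be illustrated with a deliberately simplified sketch. It uses one-split decision stumps instead of full trees and draws the feature subset once per tree rather than at every split, so it is not the randomForest implementation used through caret; all names and toy data are hypothetical.

```python
import random
from collections import Counter
from typing import List, Sequence, Tuple

Stump = Tuple[int, float, int, int]  # (feature index, threshold, left label, right label)


def fit_stump(rows: Sequence[Sequence[float]], labels: Sequence[int],
              feature_ids: Sequence[int]) -> Stump:
    """Best single-threshold split among the given features, by misclassification count."""
    best = None
    for f in feature_ids:
        for threshold in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= threshold]
            right = [y for r, y in zip(rows, labels) if r[f] > threshold]
            if not left or not right:
                continue
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0]
            errors = (sum(y != left_label for y in left)
                      + sum(y != right_label for y in right))
            if best is None or errors < best[0]:
                best = (errors, f, threshold, left_label, right_label)
    if best is None:  # no valid split in this sample: always predict its majority class
        majority = Counter(labels).most_common(1)[0][0]
        return (0, float("inf"), majority, majority)
    return best[1:]


def stump_predict(stump: Stump, row: Sequence[float]) -> int:
    f, threshold, left_label, right_label = stump
    return left_label if row[f] <= threshold else right_label


def fit_forest(rows, labels, ntree: int = 25, mtry: int = 1, seed: int = 0) -> List[Stump]:
    rng = random.Random(seed)
    n, p = len(rows), len(rows[0])
    forest = []
    for _ in range(ntree):
        idx = [rng.randrange(n) for _ in range(n)]  # bagging: bootstrap with replacement
        feats = rng.sample(range(p), mtry)          # random subspace: mtry candidate features
        forest.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feats))
    return forest


def forest_predict(forest: List[Stump], row: Sequence[float]) -> int:
    """Each stump casts one vote; the majority class wins."""
    votes = Counter(stump_predict(s, row) for s in forest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: label 1 = fissure, 0 = non-fissure (not from the study).
toy_rows = [(0, 0), (1, 1), (2, 0), (1, 2), (8, 9), (9, 8), (10, 9), (9, 10)]
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]
toy_forest = fit_forest(toy_rows, toy_labels, ntree=25, mtry=1, seed=0)
```

Because every stump sees a different bootstrap sample and feature subset, individual stumps can disagree, and the vote averages out their errors.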
Extreme gradient boosting. Compared to other algorithms, the XGBoost algorithm has received extensive attention owing to its superior efficiency, excellent learning performance, and fast training speed 77,78. XGBoost is an enhancement of the gradient-boosting decision tree (GBDT) technique that is useful for solving regression and classification tasks. It is a boosting tree algorithm that combines many weak classifiers into a robust classifier. The algorithm works by repeatedly adding trees, each grown by splitting on the features, so that every new tree learns a function that fits the residual of the previous prediction 78. Three key parameters are used in XGBoost, namely subsample (sub-sample ratio of the training instances), colsample_bytree (sub-sample ratio of columns when building each tree), and nrounds (maximum number of boosting iterations) 79,80.
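The residual-fitting loop described above can be sketched with plain gradient boosting on squared error. This is a simplified stand-in, not XGBoost itself, which adds regularization, second-order gradients, and row and column subsampling; the single toy feature, the function names, and the parameter defaults are hypothetical, with `nrounds` and `eta` named after the parameters mentioned in the text.

```python
from typing import List, Sequence, Tuple

Stump = Tuple[float, float, float]       # (threshold, left value, right value)
Model = Tuple[float, float, List[Stump]]  # (base prediction, eta, stumps)


def fit_regression_stump(xs: Sequence[float], residuals: Sequence[float]) -> Stump:
    """One-split regression tree on a single feature, minimizing squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # drop the largest value so the right side is non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean, right_mean = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def boost(xs: Sequence[float], ys: Sequence[float],
          nrounds: int = 50, eta: float = 0.3) -> Model:
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(nrounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        threshold, left_mean, right_mean = fit_regression_stump(xs, residuals)
        stumps.append((threshold, left_mean, right_mean))
        preds = [p + eta * (left_mean if x <= threshold else right_mean)
                 for x, p in zip(xs, preds)]
    return base, eta, stumps


def boosted_predict(model: Model, x: float) -> float:
    base, eta, stumps = model
    return base + sum(eta * (lm if x <= t else rm) for t, lm, rm in stumps)


def mse(model: Model, xs: Sequence[float], ys: Sequence[float]) -> float:
    return sum((boosted_predict(model, x) - y) ** 2 for x, y in zip(xs, ys)) / len(ys)


# Hypothetical toy data: one feature, target 1 = fissure, 0 = non-fissure.
toy_x = [1, 2, 3, 4, 5, 6]
toy_y = [0, 0, 0, 1, 1, 1]
toy_model = boost(toy_x, toy_y, nrounds=50, eta=0.3)
```

A smaller `eta` shrinks each tree's contribution, so more rounds are needed to reach the same fit; this is the trade-off behind tuning `eta` and `nrounds` together.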
Naïve Bayes. NB is a simple and widely used algorithm applied in various fields (computer science, earth sciences, text classification, and medicine) 81. This approach is practical when a sample X can be characterized as a conjunction of conditionally independent attributes 81,82. Based on Bayesian probability theory, Bayesian learning makes it possible to compute the posterior probability from the prior probabilities 83,84. The primary advantage of the NB model is that it is relatively simple to implement and does not require extensive hyperparameter tuning 84. The mathematical foundation of NB is strong, and its classification efficiency is consistent. NB works well with small amounts of data, can handle multi-class classification tasks, and can be trained incrementally. The disadvantages of the NB model are that it is sensitive to how the input data are represented and that the prior probability must be computed 85.
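The posterior computation under the conditional-independence assumption can be sketched as below. The sketch assumes a Gaussian likelihood for each attribute, which differs from the kernel density option (usekernel = T) mentioned for the study's NB model; the function names and toy values are hypothetical.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

ClassModel = Tuple[float, List[Tuple[float, float]]]  # (prior, [(mean, variance) per attribute])


def fit_gaussian_nb(rows: Sequence[Sequence[float]],
                    labels: Sequence[int]) -> Dict[int, ClassModel]:
    """Estimate the class priors and per-attribute Gaussian parameters."""
    by_class = defaultdict(list)
    for row, y in zip(rows, labels):
        by_class[y].append(row)
    model = {}
    for y, members in by_class.items():
        prior = len(members) / len(rows)
        stats = []
        for col in zip(*members):
            mean = sum(col) / len(col)
            var = sum((v - mean) ** 2 for v in col) / len(col) + 1e-9  # avoid zero variance
            stats.append((mean, var))
        model[y] = (prior, stats)
    return model


def posterior(model: Dict[int, ClassModel], row: Sequence[float]) -> Dict[int, float]:
    """P(class | row): prior times the product of per-attribute likelihoods, normalized."""
    log_scores = {}
    for y, (prior, stats) in model.items():
        score = math.log(prior)
        for v, (mean, var) in zip(row, stats):
            score += -0.5 * math.log(2 * math.pi * var) - (v - mean) ** 2 / (2 * var)
        log_scores[y] = score
    top = max(log_scores.values())  # subtract the maximum for numerical stability
    unnorm = {y: math.exp(s - top) for y, s in log_scores.items()}
    total = sum(unnorm.values())
    return {y: u / total for y, u in unnorm.items()}


# Hypothetical toy data: two attributes, label 1 = fissure, 0 = non-fissure.
toy_rows = [(1.0, 2.0), (1.2, 1.8), (0.8, 2.2), (3.0, 0.5), (3.2, 0.4), (2.8, 0.6)]
toy_labels = [0, 0, 0, 1, 1, 1]
toy_nb = fit_gaussian_nb(toy_rows, toy_labels)
```

Working in log space keeps the product of many small likelihoods from underflowing, which matters once the number of conditioning factors grows.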
K-Nearest Neighbor. KNN algorithms are supervised ML algorithms that do not require an explicit training phase; they are also referred to as lazy algorithms 86. KNN can be used to handle regression and classification problems 81,85. KNN finds the k nearest samples based on the distance between samples and uses their values to predict the value of the target sample 81,87. These k samples are those most similar to the sample examined. Once the method has
www.nature.com/scientificreports/
Models testing. Validation is an integral part of the modeling process 50. Validation is performed in every modeling technique to assess whether the model has achieved reasonably reliable results for the target 88. Model evaluation helps determine the suitability of the model and the elements that require improvement 50. Thus, 30% of the dataset was used to validate the models. The confusion matrix, overall accuracy, and area under the receiver operating characteristic (AUROC) curve were used to validate the earth fissure models in this study.
The receiver operating characteristic (ROC) curve and Kappa index. ROC analysis can be used to evaluate the performance of earth fissure hazard models. The ROC curve plots sensitivity against 1 − specificity over varying cut-off thresholds. AUROC, a statistical summary of the overall performance of the earth fissure models, is used for quantitative comparison 89. The AUROC expresses the probability that the classifier ranks a randomly chosen earth fissure pixel as more indicative of an earth fissure than a randomly chosen non-earth fissure pixel. An AUROC of 0.5 suggests a non-informative model, whereas an AUROC of 1 represents a perfect model that correctly identifies all earth fissure and non-earth fissure pixels 89,90. The AUROC standard error was used to examine the significance of one classifier having a larger AUROC than another 91. The smaller the standard error, the better the model performs. The Kappa index (κ) can be used to determine the reliability of earth fissure models 92,93. It quantifies the capacity of earth fissure models to classify earth fissure pixels 94 and is calculated as the proportion of observed agreement that exceeds the agreement expected by chance:
$$\kappa = \frac{p_o - p_e}{1 - p_e},$$
where $p_o$ is the observed agreement and $p_e$ is the agreement expected by chance. According to Landis and Koch 95, κ values of 0.41-0.60 indicate moderate agreement, 0.61-0.80 substantial agreement, and 0.81-1.00 almost perfect agreement.
Quality parameters and accuracy measure. Five statistical evaluation measures were employed to assess the performance of the trained earth fissure models: accuracy, specificity, sensitivity, negative predictive value, and positive predictive value. Accuracy is the proportion of fissure and non-fissure pixels correctly classified by the model. Specificity is the proportion of non-fissure pixels correctly classified as non-fissure, and sensitivity is the proportion of fissure pixels correctly classified as fissure. The negative predictive value (NPV) is the probability that a pixel classified as non-fissure is truly non-fissure, whereas the positive predictive value (PPV) is the probability that a pixel classified as fissure is truly a fissure 89:
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad \text{Sensitivity} = \frac{TP}{TP + FN}, \qquad \text{Specificity} = \frac{TN}{TN + FP},$$
$$\text{PPV} = \frac{TP}{TP + FP}, \qquad \text{NPV} = \frac{TN}{TN + FN},$$
where TP (true positive) and TN (true negative) are the numbers of pixels that are correctly identified, whereas FP (false positive) and FN (false negative) are the numbers of pixels erroneously identified.
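As a worked illustration of these measures, the sketch below computes them, together with the Kappa index, from a single confusion matrix. The counts are an assumption: they are back-calculated from the RF sensitivity (93.48%) and specificity (97.83%) reported in the Results, supposing a validation set of 46 fissure and 46 non-fissure points (about 30% of 152 each); the paper does not list the matrix itself in this section. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Confusion:
    tp: int  # fissure pixels classified as fissure
    fp: int  # non-fissure pixels classified as fissure
    tn: int  # non-fissure pixels classified as non-fissure
    fn: int  # fissure pixels classified as non-fissure


def evaluate(c: Confusion) -> Dict[str, float]:
    n = c.tp + c.fp + c.tn + c.fn
    accuracy = (c.tp + c.tn) / n
    # Chance agreement p_e from the row and column marginals of the matrix.
    p_yes = ((c.tp + c.fp) / n) * ((c.tp + c.fn) / n)
    p_no = ((c.tn + c.fn) / n) * ((c.tn + c.fp) / n)
    p_e = p_yes + p_no
    return {
        "accuracy": accuracy,
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
        "kappa": (accuracy - p_e) / (1 - p_e),
    }


# Assumed counts, inferred as described above (not taken from a published table).
rf_assumed = Confusion(tp=43, fp=1, tn=45, fn=3)
rf_metrics = evaluate(rf_assumed)
```

With these assumed counts, the accuracy (88/92), sensitivity (43/46), specificity (45/46), and kappa agree with the RF values reported in the Results to the stated precision. With balanced classes, the chance agreement is 0.5, so kappa reduces to 2 × accuracy − 1.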

Results
The results of factors multicollinearity analysis. The
The results of earth fissure hazard mapping. In this study, the ML models were used to assess and map earth fissure hazard; all the models were built in R studio software packages (version 3.6.1). After applying the training dataset to the RF, XGBoost, NB, and KNN models, earth fissure hazard indices were calculated for all parts of the research area 34. Earth fissure hazard indices were reclassified into three hazard levels (low, medium, and high) using a similar field classification procedure 34 59. This study used the AUROC method due to its correspondence, satisfaction, and ability to produce quantitative model estimates. All the models achieved very good to excellent results, with AUROC values ranging from 88 to 99% (Fig. 9). In terms of AUROC, there was no significant difference in the output between the RF, XGBoost, and NB models. As presented in Fig. 9, the RF produced an AUROC of 99% and an overall accuracy of 95.6%, XGBoost an AUROC of 98% with an overall accuracy of 92.3%, and NB an AUROC of 96% with an overall accuracy of 88%. In comparison, the KNN produced an AUROC of 88%, with an overall accuracy of 80.4%. Hence, the RF, XGBoost, and NB algorithms were shown to achieve better hazard modeling. The consistency achieved among the applied models indicates that they are sufficiently accurate to predict possible future earth fissures over the region. The RF, XGBoost, KNN, and NB models were evaluated by various statistical measures (Tables 3 and 4). The RF model, with an AUC of 99%, achieved the highest accuracy, followed by XGBoost (AUC 98%), NB (AUC 96%), and KNN (AUC 88%).
The models demonstrated excellent results in predicting earth fissure hazard in the study area, with AUC values of 88% or higher. Furthermore, the kappa index was used to assess the reliability of the earth fissure models; the kappa value of the KNN model was 0.608, indicating "moderate" agreement, and that of the NB model was 0.760, indicating "substantial" agreement. The RF (0.913) and XGBoost (0.847) models achieved "almost perfect" agreement in terms of kappa value. The kappa values indicate model consistency and reliability, as well as a high degree of agreement between the models and reality. In the classification of the earth fissure zone, the highest positive predictive value was observed in the RF model (97.83%), indicating that a zone classified as earth fissure by the model was truly an earth fissure zone in 97.83% of cases. The NB model achieved a value of 92.68%, followed by the XGBoost model (88.24%) and the KNN model (86.84%). Additionally, the XGBoost model achieved the highest negative predictive value (97.56%), which indicates that the likelihood of correctly classifying the non-fissure zone was 97.56%. The RF model achieved 93.75%, followed by the NB model (84.31%) and the KNN model (75.93%). In the classification of earth fissure pixels, the XGBoost model produced the highest sensitivity (97.83%), revealing that 97.83% of earth fissure pixels were correctly classified as earth fissures, while the RF model correctly classified 93.48%, followed by the NB model (82.61%) and the KNN model (71.74%). Additionally, the RF model achieved the highest specificity (97.83%), which indicates that 97.83% of the non-earth fissure pixels were correctly classified as non-earth fissure. The NB model achieved a specificity of 93.48%, followed by the KNN model (89.13%) and the XGBoost model (86.96%).
Overall, all four earth fissure models achieved good results in the classification of earth fissure and non-earth fissure pixels and are acceptable, with the RF model displaying the most stable and efficient results among them.
Analysis of conditioning factors importance. The sensitivity and significance of each earth fissure conditioning factor are essential outputs used to calculate the earth fissure hazard map 34. The out-of-bag (OOB) error in the RF was used to rank the significance of the conditioning factors during model training (Fig. 10). For the RF model, well density was the most important factor, followed by elevation, groundwater drawdown, water level, distance to faults, drainage density, land subsidence, NDVI, slope, land use, and aspect. The most important factor in XGBoost was elevation, followed by well density, water level, distance to faults, groundwater drawdown, land subsidence, drainage density, NDVI, slope, land use, and aspect. In contrast, for NB and KNN, well density was the most important factor for predicting earth fissures, followed by NDVI, slope, water level, land subsidence, elevation, groundwater drawdown, drainage density, aspect, land use, and distance to faults. Our results are aligned with those of previous studies, demonstrating that excessive water withdrawal, high well density, and high groundwater extraction contribute to earth fissures 3,34,96. In contrast, aspect and land use were identified as the least important factors. The modelling strategy affects the relative importance of the predictor variables in earth fissure modeling 89. Therefore, the importance of a predictor variable can differ between models, and a variable of high relative importance in one model may be useless in another 89.

Discussion
Evaluation of factors of importance. Evaluating the significance of predictor factors is useful for environmental managers responsible for allocating scarce resources and planning natural resource management 97. Brown and Nicholls 98 emphasized the importance of analyzing the relationship among land subsidence, earth fissures, and environmental factors since it enables planners to focus on the influence of human activities. While numerous analytical and expert opinion-based methods for analyzing natural hazards have been presented, the relative impact of geo-environmental variables continues to be debated 99. In general, decision-makers have benefited from the fresh insights that ML algorithms provide into the linkages between hydro-geological and geo-environmental factors and the occurrence of earth fissures, and these algorithms are now viewed as a convenient tool capable of effectively contributing to improved environmental management 100. The modelling strategy affects the relative importance of the conditioning factors in earth fissure modeling. Therefore, the significance of a predictor factor can differ between models, and a factor of high relative importance in one model may be useless in another 89. The relative importance of the predictors was determined in our study using the RF, XGBoost, NB, and KNN models. The analysis determined that well density is the main factor for predicting earth fissures in the RF, NB, and KNN models. These results align with those of previous studies, demonstrating that excessive water withdrawal, high well density, and high groundwater extraction contribute to the issue of earth fissures 3,34,96. Intense groundwater withdrawal has resulted in a catastrophic fall in potentiometric levels, putting aquifer systems under strain and stress and eventually resulting in earth fissures and land failures 101.
Ground deformation caused by pumping occasionally happens in aquifer systems with poorly cemented sediments 102 . As Burbey 103 discusses, favourable conditions for land subsidence and earth fissures include the following: (1) long-period groundwater extraction that causes a large drop in the water table, (2) materials that are thick and compressible, and (3) failures of the tectonic plates (e.g., faults) and geological discontinuities that allow for the buildup of stress. Thus, excessive groundwater extraction should be severely restricted in the majority of the study area to prevent the development of earth fissures. Additionally, artificial recharging of groundwater (ARG) initiatives improve an aquifer's water balance, potentially reducing the hazards of earth fissures. In all four models, the aspect and land use were recognized as the least important factors.
Predictive performance of models. Modeling and simulation can help us learn more about environmental threats and make better decisions. The structure of modeling methods, on the other hand, varies significantly, resulting in a wide range of outputs and predictive performance. Model-based spatial predictions are presently recognized as a critical aim of ecological and geo-environmental research since they guide the decision-making of managers and environmental planners 104. Notably, the diversity of modeling techniques helps planners become aware of and understand environmental problems and build effective environmental plans 105. According to Araujo and Guisan 106, even when the same model types are used in various sectors, there may be heterogeneity in the forecasts and results. Thus, comparative studies are necessary to analyze the performance of models in similar environments and accurately appraise their abilities 107. This work explored four machine learning approaches (RF, XGBoost, NB, and KNN) to determine the most accurate way to assess earth fissure locations. In terms of AUC values, the RF model (AUROC = 99%) outperformed the XGBoost (AUROC = 98%), NB (AUROC = 96%), and KNN (AUROC = 88%) models, although the differences in predictive performance and goodness-of-fit among the RF, XGBoost, and NB models were small. The KNN method predicted that most central areas have a medium or high occurrence of earth fissures. In contrast, the RF, XGBoost, and NB models classified these areas as having a lower occurrence of earth fissures. In general, the four models agreed in predicting most hazard areas. Moreover, the high-hazard areas align with where earth fissures occur in the basin. The consistency achieved among the applied models indicates that they are sufficiently accurate to predict possible future earth fissures over the region.
Few research studies have investigated the performance of RF, XGBoost, NB, and KNN models in geo-environmental fields (e.g., landslide, air quality, and flash flood) 80,86,108-111. However, it is difficult to directly compare our findings with this research because XGBoost, NB, and KNN models have not previously been used to predict earth fissures. Therefore, our models' performance was compared with that of the same models in other hazard assessment applications. High-accuracy models have been highlighted in the literature 110,111, where RF was found to be the most successful model. This outcome is in line with our findings. KNN was also found to be effective 112. Naghibi et al. 111 also emphasized the KNN model's high accuracy. In a study comparing the performance of ML algorithms for flood susceptibility prediction, Madhuri et al. 113 found that XGBoost outperformed KNN, and that RF and NB outperformed KNN as well. Although it had the lowest predictive accuracy of the methods evaluated, the KNN method was still useful. Of note, the results obtained in this study coincide with similar work on the same application 34, which compared several models for predicting earth fissure hazards and found that the RF model best predicted the earth fissure hazard. Notably, both tree-based models (RF and XGBoost) performed well, demonstrating their overall capability for modeling earth fissures. According to França et al. 105, whereas linear modeling approaches usually fail to satisfy a variety of statistical assumptions, such as variable independence and variable statistical distribution, tree-based models frequently escape these limits. Tree-based ML models, notably RF and XGBoost, were shown in this work to be capable of uncovering complex nonlinear relationships.
These results are consistent with similar findings in ref. 84, which provided an extensive comparison of various ML models and found that tree-based models are superior to other ML models. The fundamental drawback of single-tree models is overcome by fitting multiple trees in the RF and XGBoost models 114,115. As a result, based on the models' performance and ease of interpretation, this study shows that the chosen models are viable. Advanced environmental hazard analyses are needed as the increasing human population leads to high demand for shelter and infrastructure. Further accurate studies are required in order to identify hazard-free zones.

Conclusion
In this study, we used four ML algorithms (RF, XGBoost, NB, and KNN) to model and forecast earth fissure hazard levels and identify the key processes leading to the hazard in the Qa' Jahran Basin, Yemen. The results show that approximately 7.34-21.30% of the overall area was found to be highly vulnerable to earth fissure hazard, 5.25-37.66% to fall in the medium hazard level, and 41.03-85.52% in the low hazard level. The regions most sensitive to earth fissure hazards were found in the northern part of the basin. The most significant applied conditioning factors were well density, land subsidence, groundwater drawdown, distance to fault, and geology. Agricultural and residential areas in the study region have expanded, and groundwater is the primary water source. For these types of land, groundwater exploitation is exceptionally high, and many unregulated deep wells are used mainly for farming purposes in the Qa' Jahran Basin. The region's continuous development and urbanization therefore pose significant questions regarding the ability to satisfy future water demand and the resulting dangers of earth fissures. The area is also still considered tectonically active, which is another source of earth fissures. Although this study attempted to incorporate all accessible and relevant data for earth fissure modeling, more factors, such as sediment thickness, could be considered and may contribute to better prediction accuracy. In addition, the fissure sample was prepared using only the locations of the fissure events, not their dates. However, the timing of fissuring may reflect the effect of changes in some factors, such as land use changes and groundwater fluctuations, on the occurrence of fissures. This is also an intriguing area for further research. Finally, ML is often applied to classification problems, in which the class label is predicted from the predictors when the target or output variable is categorical.
Data imbalance problems may arise in this case and often yield inappropriate results. In a future study, we will address imbalanced datasets, the problems they pose for prediction, and how to deal with such data more efficiently than conventional ML approaches.
This research provides helpful insights for future studies on detecting earth fissures using ML algorithms. The hazard maps of earth fissures and knowledge of hazardous locations would support decision-makers as a roadmap to make appropriate decisions for handling and tracking the potential losses incurred by the vulnerable environment. The results could also be helpful for water resource managers to enable sound decision-making for groundwater withdrawal regulations. More potential research is required to study the existence of natural hazards.

Data availability
The datasets used and/or analyzed during the current study are available from the corresponding author upon reasonable request.

Code availability
RStudio desktop (version 1.3.9), and R (version 3.6.2) were used for statistical analysis and running the models. All R Studio codes are available upon reasonable request.